Coarse grain parallel codes for solving sparse systems of linear algebraic equations can be developed in several different ways. The following procedure is suitable for some parallel computers. A preliminary reordering of the matrix is first applied to move as many zero elements as possible to the lower left corner. After that the matrix is partitioned into large blocks and the blocks in the lower left corner contain only zero elements. An attempt to obtain a good load-balance is carried out by allowing the diagonal blocks to be rectangular.
While the algorithm based on the above ideas has good parallel properties, some stability problems may arise during the factorization because the pivotal search is restricted to the diagonal blocks. A simple a priori procedure has been used in a previous version in an attempt to stabilize the algorithm. In this paper it is shown that three enhanced stability devices can successfully be incorporated in the algorithm so that it is further stabilized and, moreover, the parallel properties of the original algorithm are preserved.
The first device is based on a dynamic check of the stability. In the second device a slightly modified reordering is used in an attempt to get more nonzero elements in the diagonal blocks (the number of candidates for pivots tends to increase in this situation and, therefore, there is a better chance to select more stable pivots). The third device applies a P5-like ordering as a secondary criterion in the basic reordering procedure. This tends to improve the reordering and the performance of the solver. Moreover, the device is stable, while the original P5 ordering is often unstable.
Numerical results obtained by using the three new devices are presented. The well-known sparse matrices from the Harwell-Boeing set are used in the experiments. 相似文献
Three currently available concurrent language systems, Pascal-Plus, occam and Edison, are used to implement a controller for a robot arm. The robot arm allows real parallelism of operation within the movements of the arm. The feasibility and restrictions placed upon the resultant solution for each of the language systems is then analysed and discussed. A Petri-net solution is also presented for the generalized problem and it is shown that each of the solutions is a different folding of the general net. 相似文献
Some of the modern powerful digital signal processors (DSPs) have byte-addressable internal data memory. This property is valuable especially in computationally demanding inter frame video encoding, where data accesses are typically unaligned according to word boundaries. The byte-addressable memory allows load or store command to start accessing from any byte-address, providing at most as many successive bytes from subsequent addresses as data bus can handle in parallel. Maybe the simplest way to construct such a byte-addressable memory is to use N 8-bit memory modules or banks to be accessed in parallel, when N is data bus width in bytes. However, in addition to byte-addressable subsequent bytes, memory consisting of parallel memory modules can provide much more versatile addressing capabilities with reasonable implementation cost. Versatile access formats can significantly reduce the need for data reordering in the register file. At first, we provide motivation for using parallel memory architecture with versatile access formats as an internal on-chip data memory of modern DSP. After this, notations are described and general view of parallel memory design is given. We propose some example parallel data memory architecture designs with data access formats especially helpful in H.263 encoding and MPEG-4 core profile motion and texture encoding. The examples are given for different data bus widths (16, 32, 64, and 128 bits). Finally, performance is shortly compared to other memory architectures and area, delay, and power figures are estimated.Jarno K. Tanskanen was born in Joensuu, Finland in 1975. He studied analog and digital electronics in the Department of Electrical Engineering, and computer architecture in the Department of Information Technology at Tampere University of Technology, where he received his M.Sc. degree in 1999. He is currently working as a research scientist in the Institute of Digital and Computer Systems at TUT. His Dr.Tech. research concerns parallel processing of video compression. jarno.tanskanen@tut.fiReiner Creutzburg received his Diploma in Mathematics in 1976 and attained his Ph.D. in Mathematics in 1984 from the Rostock University, Germany. Prof. Creutzburg has published 3 books, filed 2 patents, and produced approximately 100 articles, preprint, and conference papers. Professional Experience: Since 2000—Part-time Professor for Multimedia technology, Tampere University of Technology, Finland. Since 1992—Full-time Professor of Computer Science, Fachhochschule Brandenburg-University of Applied Sciences, Brandenburg, Germany. 1990 to 1992—Assistant Professor, University of Karlsruhe, Institute of Algorithms and Cognitive Systems, Germany. 1987 to 1989—Head of the Research Section Image Processing. 1986 to 1989—Founder and Head of the International Base Laboratory of Image Processing and Computer Graphics for East European countries at the Central Institute of Cybernetics and Information Processes of the Academy of Sciences (Berlin), Germany. 1976 to 1989—Researcher and Assistant Professor in various Universities and the Academy of Sciences, Central Institute of Cybernetics and Information, Berlin. creutzburg@fh-brandenburg.deJarkko T. Niittylahti was born in Orivesi, Finland, in 1962. He received the M.Sc, Lic.Tech, and Dr.Tech degrees at Tampere University of Technology (TUT) in 1988, 1992, and 1995, respectively. From 1987 to 1992, he was a researcher at TUT. In 1992–93, he was a researcher at CERN in Geneva, Switzerland. In 1993–95, he was with Nokia Consumer Electronics, Bochum, Germany, and in 1995–97 with Nokia Research Center, Tampere, Finland. In 1997–2000, he was a Professor at Signal Processing Laboratory, TUT, and in 2000–2002 at Institute of Digital and Computer Systems, TUT. Currently, he is a Docent of Digital Techniques at TUT and the managing director of Staselog Ltd. He is also a co-founder and President of Atostek Ltd. He is interested in designing digital systems and architectures. jarkko.niittylahti@tut.fi 相似文献